NSF PAR Search | NSF Public Access Repository


When is Differentially Private Finetuning Private?

Rinberg, Roy; Pawelczyk, Martin (October 2024, Statistical Foundations of LLMs and Foundation Models (NeurIPS 2024 Workshop))

Differential Privacy (DP) is a mathematical definition that enshrines a formal guarantee that the output of a query does not depend greatly on any individual in the dataset. DP does not formalize a notion of "background information" and does not provide a guarantee about how much an output can be identifying to someone who has background information about an individual. In this paper, we argue that privately fine-tuning a pre-trained machine learning model on a private dataset using differential privacy does not always yield meaningful notions of privacy. Simply offering differential privacy guarantees in terms of (ε, δ) is insufficient to ensure human notions of privacy when the original training data is correlated with the fine-tuning dataset. We emphasize that, alongside differential privacy assurances, it is essential to report measures of dataset similarity and model attackability (for which model size can be a proxy). This is a work in progress; it is primarily a position piece, arguing for how DP should be used in practice and what future research needs to be conducted in order to better answer those questions.
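
For reference, the (ε, δ) guarantee mentioned in the abstract is usually stated as follows. This is the standard textbook definition, not a formula taken from the paper; the symbols M (a randomized mechanism), D and D' (datasets differing in one individual's record), and S (a set of possible outputs) are notation introduced here for illustration.

```latex
% Standard (\varepsilon, \delta)-differential privacy (textbook form; notation assumed here).
% M is a randomized mechanism; D and D' are neighboring datasets that differ in
% one individual's record; S is any measurable set of outputs.
\[
  \Pr\bigl[M(D) \in S\bigr] \;\le\; e^{\varepsilon}\,\Pr\bigl[M(D') \in S\bigr] + \delta
  \qquad \text{for all neighboring } D, D' \text{ and all } S .
\]
```

The inequality bounds how much any one record can change the output distribution. It says nothing about what an observer can infer from information held outside D, which is the gap the abstract points to when pre-training data is correlated with the fine-tuning data.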
Full Text Available
Data-Unlearn-Bench: Making Evaluating Data Unlearning Easy

Rinberg, Roy; Puigdemont, Pol; Pawelczyk, Martin; Cevher, Volkan (June 2024, ICML 2025 Workshop on Machine Unlearning for Generative AI (https://openreview.net/group?id=ICML.cc/2025/Workshop/MUGen))

Evaluating machine unlearning methods remains technically challenging, with recent benchmarks requiring complex setups and significant engineering overhead. We introduce a unified and extensible benchmarking suite that simplifies the evaluation of unlearning algorithms using the KLoM (KL divergence of Margins) metric. Our framework provides precomputed model ensembles, oracle outputs, and streamlined infrastructure for running evaluations out of the box. By standardizing setup and metrics, it enables reproducible, scalable, and fair comparison across unlearning methods. We aim for this benchmark to serve as a practical foundation for accelerating research and promoting best practices in machine unlearning. Our code and data are publicly available.
Full Text Available
OpenXAI: Towards a Transparent Evaluation of Model Explanations

Agarwal, Chirag; Krishna, Satyapriya; Saxena, Eshika; Pawelczyk, Martin; Johnson, Nari; Puri, Isha; Zitnik, Marinka; Lakkaraju, Himabindu (December 2023, Advances in neural information processing systems)
On the Privacy Risks of Algorithmic Recourse

Pawelczyk, Martin; Lakkaraju, Himabindu; Neel, Seth (April 2023, International Conference on Artificial Intelligence and Statistics)
Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse

Pawelczyk, Martin; Datta, Teresa; van-den-Heuvel, Johannes; Kasneci, Gjergji; Lakkaraju, Himabindu (May 2023, International Conference on Learning Representations)
Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse

Pawelczyk, Martin; Datta, Teresa; Van-den-Heuvel, Johannes; Kasneci, Gjergji; Lakkaraju, Himabindu (April 2023, International Conference on Learning Representations (ICLR))
OpenXAI: Towards a Transparent Evaluation of Model Explanations

Agarwal, Chirag; Krishna, Satyapriya; Saxena, Eshika; Pawelczyk, Martin; Johnson, Nari; Puri, Isha; Zitnik, Marinka; Lakkaraju, Himabindu (October 2022, Advances in neural information processing systems)

Full Text Available
Exploring Counterfactual Explanations Through the Lens of Adversarial Examples: A Theoretical and Empirical Analysis

Pawelczyk, Martin; Agarwal, Chirag; Joshi, Shalmali; Upadhyay, Sohini; Lakkaraju, Himabindu (January 2022, International Conference on Artificial Intelligence and Statistics)

Full Text Available
